feat: Add AmortizedVIPosterior for amortized variational inference#1751
Conversation
…ing in training; add gradient flow test
…curacy comparison
…r; update references in implementation and tests
- Reuse Zuko flow enum
- Align MAP with potential-based logic
- Tighten sampling/validation behavior while updating tests and docs
- Fix VIPosterior.to() to return self for method chaining (matches AmortizedVIPosterior)
- Add theta dimension validation in AmortizedVIPosterior.train() to catch mismatches early
- Remove ZukoFlowType from top-level sbi.inference exports (still available via sbi.inference.posteriors)
- Update test imports accordingly

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
❌ 15 Tests Failed (9 flaky tests also reported).
manuelgloeckler left a comment:
Hey Jan,
thanks for implementing this. I left a few comments on the tests and `__init__` changes.
I wonder if it's somewhat easy to connect the old VIPosterior parts with the new AmortizedVIPosterior a bit better. But that would require quite a few changes to the VI parts, I think, so I'm also happy to do the amortized VI separately.
Add `build_zuko_vi_flow` function that creates unconditional Zuko normalizing flows for variational inference training. Supports:
- NSF (Neural Spline Flow)
- MAF (Masked Autoregressive Flow)
- Gaussian (full covariance)
- Gaussian diagonal

Also includes helper `_build_zuko_gaussian_flow` with custom affine transforms (diagonal and lower triangular) for Gaussian variants.

This addresses Phase 1, Step 1.1 of the VI unification plan.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
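As a dependency-free sketch of how such a builder can dispatch on the flow type (the enum members and the zuko constructor strings below are illustrative assumptions, not the actual sbi code):

```python
from enum import Enum

class ZukoFlowType(Enum):
    """Hypothetical mirror of the flow-type enum; member names assumed."""
    NSF = "nsf"
    MAF = "maf"
    GAUSSIAN = "gaussian"
    GAUSSIAN_DIAG = "gaussian_diag"

def build_zuko_vi_flow(flow_type: ZukoFlowType, dim: int) -> str:
    """Sketch of the builder's dispatch logic; returns a description string
    instead of a real zuko flow so the example stays dependency-free."""
    builders = {
        ZukoFlowType.NSF: f"zuko NSF flow, features={dim}, unconditional",
        ZukoFlowType.MAF: f"zuko MAF flow, features={dim}, unconditional",
        ZukoFlowType.GAUSSIAN: f"affine flow, lower-triangular scale ({dim}x{dim})",
        ZukoFlowType.GAUSSIAN_DIAG: f"affine flow, diagonal scale ({dim},)",
    }
    try:
        return builders[flow_type]
    except KeyError:
        raise ValueError(f"Unsupported flow type: {flow_type}") from None
```

The Gaussian variants correspond to the custom affine transforms mentioned above: a diagonal scale for `gaussian_diag` and a lower-triangular one for the full-covariance case.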
Update DivergenceOptimizer to handle both Pyro TransformedDistribution and ZukoUnconditionalFlow variational distributions:
- Add VariationalDistribution type alias for Union of flow types
- Detect flow type via isinstance check for ZukoUnconditionalFlow
- Handle Pyro-specific set_default_validate_args conditionally
- Properly register Zuko nn.Module flows in ModuleList

Part of VI unification (Phase 2): adapting existing VI infrastructure to work with new Zuko-based flows alongside legacy Pyro flows.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 3 of VI unification: Extend Zuko flow support to all divergence optimizer subclasses (ElboOptimizer, ForwardKLOptimizer, RenyiDivergenceOptimizer).

Changes:
- Update warmup() to support Zuko flows using sample_and_log_prob
- Add clear error for unsupported 'identity' warmup with Zuko flows
- Fix missing 'raise' in NotImplementedError for invalid warmup methods
- Update _loss() methods to check _is_zuko flag for reparameterized sampling
- Refactor elbo_particles() to handle both Zuko and Pyro flow APIs
- Fix typo: 'inital_target' -> 'initial_target'

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
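The dual-API handling can be illustrated with stand-in classes. `ZukoStyleFlow` and `PyroStyleFlow` below are hypothetical mocks of the two interfaces (one offering `sample_and_log_prob`, the other `rsample` plus `log_prob`), not the real sbi or library classes:

```python
import math
import random

class ZukoStyleFlow:
    """Mock of the Zuko-style API: one call returns samples with log-probs."""
    def sample_and_log_prob(self, n):
        xs = [random.gauss(0.0, 1.0) for _ in range(n)]
        lps = [-0.5 * (x * x + math.log(2 * math.pi)) for x in xs]
        return xs, lps

class PyroStyleFlow:
    """Mock of the Pyro-style API: reparameterized rsample() + log_prob()."""
    def rsample(self, n):
        return [random.gauss(0.0, 1.0) for _ in range(n)]
    def log_prob(self, xs):
        return [-0.5 * (x * x + math.log(2 * math.pi)) for x in xs]

def sample_and_log_prob(q, n):
    """Dispatch on the flow API, mirroring the optimizer's isinstance check."""
    if isinstance(q, ZukoStyleFlow):
        return q.sample_and_log_prob(n)
    samples = q.rsample(n)
    return samples, q.log_prob(samples)
```

Downstream loss code then only ever sees `(samples, log_probs)` pairs, regardless of the backend.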
- Fix duplicate __all__ exports in sbi/inference/__init__.py
- Remove incorrect ZukoFlowType export from posteriors/__init__.py
- Add thread-safety lock to AmortizedVIPosterior.set_x()
- Fix missing comma in VIPosterior progress bar display

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Phase 3 Steps 3.1-3.3 of VI unification:
- Add _mode tracking attribute for single_x vs amortized modes
- Add _build_zuko_flow helper method for building Zuko flows
- Update set_q to use Zuko for maf/nsf/mcf/scf flow types
- Keep Pyro flows for gaussian/gaussian_diag for backwards compat
- Add ZukoUnconditionalFlow validation in set_q
- Fix undefined transforms bug in build_zuko_unconditional_flow

The train(x_o) signature remains unchanged. DivergenceOptimizer (adapted in Phase 2) handles both Pyro and Zuko flows.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…rence

Add amortized VI support to VIPosterior:
- New train_amortized(theta, x) method for training conditional flows q(θ|x)
- Updated sample() and log_prob() to handle both single-x and amortized modes
- Added _build_conditional_flow() helper using Zuko conditional flows
- Mode tracking with warnings when switching between modes
- Thread-safety lock for potential_fn state during ELBO computation
- Validation-based early stopping for amortized training

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix deepcopy/pickle of VIPosterior with _set_x_lock attribute:
  - __deepcopy__: create a new Lock instead of attempting deepcopy
  - __getstate__: pop _set_x_lock from the state dict (not picklable)
  - __setstate__: restore the lock immediately after restoring __dict__
- Update tests for Zuko flow compatibility:
  - Use sample() instead of rsample() for Zuko flows (which don't have rsample)
  - Skip the .support attribute check for Zuko flows (they don't expose it)
- Fix typo in docstring: "due not support" -> "do not support"

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
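A minimal sketch of this lock-handling pattern, using a hypothetical `PosteriorWithLock` class in place of VIPosterior:

```python
import copy
import pickle
import threading

class PosteriorWithLock:
    """Sketch of making an object that holds a threading.Lock safe to
    deepcopy and pickle (locks support neither out of the box)."""

    def __init__(self):
        self._set_x_lock = threading.Lock()
        self.data = {"trained": False}

    def __deepcopy__(self, memo):
        # Copy everything except the lock; give the copy a fresh Lock.
        cls = self.__class__
        new = cls.__new__(cls)
        memo[id(self)] = new
        for k, v in self.__dict__.items():
            if k == "_set_x_lock":
                new.__dict__[k] = threading.Lock()
            else:
                new.__dict__[k] = copy.deepcopy(v, memo)
        return new

    def __getstate__(self):
        # Drop the unpicklable lock from the pickled state.
        state = self.__dict__.copy()
        state.pop("_set_x_lock", None)
        return state

    def __setstate__(self, state):
        # Restore attributes, then recreate the lock immediately.
        self.__dict__.update(state)
        self._set_x_lock = threading.Lock()
```

Without these hooks, both `copy.deepcopy` and `pickle.dumps` raise `TypeError` on the lock.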
- Update VIPosterior.sample() to properly handle batched x in amortized mode:
  - Preserve batch dimension for multi-observation inputs
  - Squeeze singleton batch dimension to match base posterior behavior
  - Support default_x via _x_else_default_x()
- Implement sample_batched() for amortized mode (delegates to sample())
- Improve log_prob() docstring with batched x documentation
- Standardize error types: use ValueError instead of AttributeError
- Migrate all tests from AmortizedVIPosterior to VIPosterior.train_amortized():
  - Update imports and constructor calls
  - Change .train() to .train_amortized() with flow params
  - Update assertions for new mode attribute

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
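The shape convention can be sketched as a small helper. This is hypothetical code, and the ordering (x-batch placed between the sample shape and the event dimension, as in sbi's `sample_batched`) is an assumption here:

```python
def amortized_sample_shape(sample_shape, x_batch, theta_dim):
    """Illustrates the batching rule described above: a singleton x-batch
    is squeezed to match the base posterior's output shape, while
    multi-observation inputs keep their batch dimension."""
    if x_batch == 1:
        return tuple(sample_shape) + (theta_dim,)
    return tuple(sample_shape) + (x_batch, theta_dim)
```

So `sample((1000,))` with a single observation yields `(1000, theta_dim)`, matching non-amortized posteriors, while five observations yield `(1000, 5, theta_dim)`.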
Remove AmortizedVIPosterior from __all__ exports in both sbi/inference/__init__.py and sbi/inference/posteriors/__init__.py. All functionality has been migrated to the unified VIPosterior class with its train_amortized() method.

Note: The amortized_vi_posterior.py file still exists pending manual deletion approval. The vi_pyro_flows.py file is retained as it's still needed for gaussian/gaussian_diag flow types.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Thanks for the review @manuelgloeckler! As discussed offline, I made major updates to this PR to combine both single-x VI and amortized VI into one posterior class.

AI Usage: As I am experimenting with different AI coding strategies at the moment, I set up a detailed "product requirement file" summarizing our goal and then ran Claude Code in a loop (similar to the RALPH setting) that runs overnight, works on one step at a time, does a self-review, writes its progress into a file, and then starts from scratch with fresh context. Therefore, many commits are co-authored by Claude. It worked reasonably well, but not perfectly; Claude failed to fix an issue.

Summary

Unified API

Instead of a separate AmortizedVIPosterior class, we now have a single VIPosterior with two training modes:
```python
# Single-x mode (unchanged)
posterior = VIPosterior(potential_fn, prior)
posterior.train()
samples = posterior.sample((1000,))

# Amortized mode
posterior = VIPosterior(potential_fn, prior)
posterior.train_amortized(theta, x, flow_type=ZukoFlowType.NSF)
samples = posterior.sample((1000,), x=x_new)  # works for any x
```

I agree with your points on the tests:
Resolved conflicts:
- vi_posterior.py: combined docstrings for sample() Args
- vi_test.py: kept sampling_method parameterization for comprehensive testing
- vi_test.py: removed duplicate assertion for K parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
This PR implements amortized variational inference via a new `AmortizedVIPosterior` class, addressing #909.

Note on AI usage: I used Claude Code to help implement this. I did many iterations and careful reviewing, both myself and with a very critical Codex 5.2 reviewer.
Context
The existing `VIPosterior` trains an unconditional variational distribution `q(θ)` for a fixed observation `x_o`. This requires retraining for every new observation, which is inefficient in scenarios requiring inference across many observations.

Amortized VI addresses this by learning a conditional distribution `q(θ|x)` that generalizes across observations. Once trained on simulation data `(θ, x)`, the posterior can provide instant samples for any new `x` without retraining.

As part of this PR, we also align MAP estimation with the base posterior logic (potential-based) and keep sampling output shapes consistent across posteriors.
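To make the objective concrete: per observation x, training maximizes the ELBO E_{q(θ|x)}[log p̃(θ, x) − log q(θ|x)], averaged over simulated observations, where p̃ is the unnormalized potential. A dependency-free Monte Carlo sketch (toy 1-D Gaussians standing in for the potential and the flow; not sbi code) shows the estimate recovering −KL(q‖p):

```python
import math
import random

def log_normal(x, mu, sigma):
    """Log density of N(mu, sigma^2)."""
    return -0.5 * (((x - mu) / sigma) ** 2 + math.log(2 * math.pi * sigma**2))

def elbo_estimate(n, rng):
    """Monte Carlo ELBO with target p = N(0, 1) (playing the role of the
    unnormalized potential) and variational q = N(0.5, 1)."""
    total = 0.0
    for _ in range(n):
        theta = rng.gauss(0.5, 1.0)  # theta ~ q
        total += log_normal(theta, 0.0, 1.0) - log_normal(theta, 0.5, 1.0)
    return total / n

rng = random.Random(0)
elbo = elbo_estimate(40_000, rng)
# Analytically, ELBO = -KL(q || p) = -(0.5)^2 / 2 = -0.125 here.
```

In the amortized setting the same quantity is computed with `q(θ|x)` conditioned on each simulated `x`, and gradients flow through reparameterized samples of the conditional flow.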
Implementation
We introduce `AmortizedVIPosterior`, which trains a conditional normalizing flow `q(θ|x)` by optimizing the ELBO against a potential function from NLE/NRE.

A `ZukoFlowType` enum provides type-safe selection of flow architectures (NSF, MAF, NAF, UNAF, SOSPF, NICE, GF, NCSF, BPF).

Design Choices
Separate class vs extending VIPosterior
We created a new class rather than adding an amortized mode to `VIPosterior`, mainly because the training signatures differ (`train()` vs `train(theta, x)`) and a separate class keeps the existing `VIPosterior` unchanged.

ZukoFlowType enum scoping
The `ZukoFlowType` enum is defined in `sbi.neural_nets.factory` and reused here; it covers the flow architectures with `log_prob` support (NSF, MAF, NAF, UNAF, SOSPF, NICE, GF, NCSF, BPF).

Potential function interface
The implementation uses the existing `potential_fn.set_x()` pattern for efficiency. This makes the class non-thread-safe, which is documented in the class docstring.

Testing
We validate correctness using a linear Gaussian problem where the true posterior is analytically known. Tests verify:
- accuracy against `VIPosterior` on the same problem
- error handling (e.g. batched `x`, untrained model)

Closes #909
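For reference, the analytic ground truth that such linear-Gaussian checks compare against can be computed in closed form. A 1-D conjugate sketch (hypothetical helper, not the sbi test code):

```python
import math

def gaussian_posterior_1d(mu0, sigma0, sigma_noise, x_obs):
    """Conjugate 1-D linear-Gaussian posterior used as ground truth:
    prior theta ~ N(mu0, sigma0^2), likelihood x | theta ~ N(theta,
    sigma_noise^2). Returns posterior mean and std given observations x_obs."""
    n = len(x_obs)
    precision = 1.0 / sigma0**2 + n / sigma_noise**2
    mean = (mu0 / sigma0**2 + sum(x_obs) / sigma_noise**2) / precision
    return mean, math.sqrt(1.0 / precision)
```

Samples from the trained amortized posterior can then be compared against this mean and standard deviation for many different observations without retraining.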